BACKGROUND OF THE INVENTION

1. Field of the Invention 

The present invention relates to the transmission of
information over a communications path. More
particularly, the present invention relates to the
communication of high-bandwidth information over
networks of varying types.

2. Art Background 

Until recently, telecommunications and computing
were considered to be entirely separate disciplines.
Telecommunications was analog and done in real time,
whereas computing was digital and performed at a rate
determined by the processing speed of a computer.
Today, such technologies as speech processing,
electronic mail and facsimile have blurred these lines.
In the coming years, computing and telecommunications
will become almost indistinguishable in a race to
support a broad range of new multimedia (i.e., voice,
video and data) applications. These applications are
made possible by emerging digital-processing
technologies, which include compressed audio (both
high-fidelity audio and speech), high-resolution still
images, and compressed video. The emerging
technologies will allow for collaboration at a
distance, including video conferencing.

Of these technologies, video is particularly exciting
in terms of its potential applications. But video is
also the most demanding in terms of processing power
and sheer volume of data to be processed. Uncompressed
digital video requires somewhere between 50 and 200
Mb/s (megabits per second) to support the real-time
transmission of standard television quality images.
This makes the widespread use of uncompressed digital
video in telecommunications applications impractical.
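As a rough check on the figures above, the raw bit rate of a video stream is the product of frame width, frame height, bits per pixel and frame rate. The sketch below works through two cases. The frame formats are assumptions chosen only for illustration and are not taken from this description.

```python
def raw_bit_rate_mbps(width, height, bits_per_pixel, frames_per_second):
    """Return the uncompressed video bit rate in megabits per second."""
    bits_per_frame = width * height * bits_per_pixel
    return bits_per_frame * frames_per_second / 1_000_000


# Hypothetical formats, assumed for illustration only.
low = raw_bit_rate_mbps(640, 480, 12, 25)   # 12 bits/pixel, 25 frames/s
high = raw_bit_rate_mbps(640, 480, 16, 30)  # 16 bits/pixel, 30 frames/s
print(low, high)
```

Both assumed formats fall inside the 50 to 200 Mb/s range quoted above.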

Fortunately, there is considerable redundancy in
video data, both in terms of information theory and
human perception. This redundancy allows for the
compression of digital video sequences into lower
transmission rates. For some time, researchers have
been aware of a variety of techniques that can be used
to compress video data sequences anywhere from 2:1 to
1000:1, depending on the quality required by the
application. Until recently, however, it was not
practical to incorporate these techniques into
low-cost video-based applications.

A number of standards have recently been developed
for such activities as video conferencing and the
transmission and storage of standard high-quality
still images, as well as standards for interactive
video playback, to provide interoperability between
numerous communications points. The standards
recognize a need for quality video compression to
reduce the tremendous amount of data required for the
transmission of video information.

Two important methods of data compression for video
information are used widely throughout the various
standards for video communication. These are the
concepts of frame differencing and motion
compensation. Frame differencing recognizes that a
normal video sequence has little variation from one
frame to the next. If, instead of coding each frame,
only the differences between a frame and the previous
frame are coded, then the amount of information needed
to describe the new frame will be dramatically
reduced. Motion compensation recognizes that much of
the difference that does occur between successive
frames can be characterized as a simple translation of
motion, caused either by the moving of objects in the
scene or by a pan of the field of view. Rather than
form a simple difference between a block in a current
frame and the same block in the previous frame, the
area around that block can be searched in the previous
frame to find an offset block that more closely
matches the block of the current frame. Once a best
match has been identified, the difference between a
reference block in the current frame and the best
match in the previous frame is coded to produce a
vector that describes the offset of the best match.
This motion vector can then be used with the previous
frame to produce the equivalent of what the current
frame should be. These methods, and others, are
incorporated into systems which make possible the
rapid transmission of real-time video information.
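The block search described above can be sketched as follows. This is a minimal illustration only: it assumes frames are lists of rows of integer pixel values, uses an exhaustive search with the sum of absolute differences as the matching cost, and the names `sad`, `get_block` and `best_match` are hypothetical. Practical encoders may use other cost measures and search strategies.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame (a list of pixel rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_match(prev, cur, top, left, size, search):
    """Search prev within +/- search pixels of (top, left) for the block
    that best matches the block of cur at (top, left).
    Returns the motion vector (dy, dx) and the residual block."""
    target = get_block(cur, top, left, size)
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block falls outside the frame
            cost = sad(target, get_block(prev, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    _, dy, dx = best
    ref = get_block(prev, top + dy, left + dx, size)
    residual = [[t - r for t, r in zip(t_row, r_row)]
                for t_row, r_row in zip(target, ref)]
    return (dy, dx), residual
```

When an object has simply moved between frames, the residual is all zeros and only the vector needs to be sent.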
As the worlds of telecommunications and computers
blend closely together, the telecommunications aspects
of communications will have to contend with some of
the constraints of the computer world. In particular,
video conferencing over existing computer networks
will prove a challenge, in that maintaining real-time
information communication over traffic-burdened
existing network protocols may prove insurmountable.

Current video algorithms assume a nearly constant
bandwidth availability for the encoding of video
information. This is evidenced by the use of only a
single output buffer for traditional video encoder
output. It is common to use the output buffer fullness
as a feedback parameter for encoding subsequent
images, i.e., with higher or lower levels of
quantization. A well-known effect resulting from using
a single output buffer is called "bit-bang," where the
output buffer is over-depleted by the interface to the
communications channel, causing the feedback loop to
indicate that the buffer can handle a large amount of
data, which in turn causes the video compression
algorithm to under-optimize the subsequent image
coding. The user perceives the bit-bang as uneven
quality and frame rate.
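The fullness feedback described above can be pictured with a small sketch. The linear mapping and the quantizer range of 2 to 31 are assumptions made for illustration, and `next_quantizer` is a hypothetical name, not a function of any particular encoder.

```python
def next_quantizer(fullness, q_min=2, q_max=31):
    """Map output-buffer fullness (0.0 to 1.0) to a quantizer step.

    A fuller buffer yields a coarser quantizer (fewer bits per image);
    an emptier buffer yields a finer quantizer (more bits per image).
    The linear mapping and the default range are assumed for illustration.
    """
    fullness = min(max(fullness, 0.0), 1.0)
    return round(q_min + fullness * (q_max - q_min))


# If the channel interface suddenly drains the buffer, fullness drops and
# the next image is coded with a much finer quantizer than the channel
# may be able to sustain, which is the bit-bang effect.
print(next_quantizer(0.9), next_quantizer(0.1))
```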

To alleviate bit-bang, the typical approach has been
to limit the amount of data pulled out from the encoder
video output buffer to a fraction of the total size of
the output buffer; 10% to 30% is typical. This approach
keeps the feedback indicator rather small and the
encoding more uniform. The underlying assumption of
this approach is that the communications channel will
usually not be changing rapidly. Exceptions are caused
by connectivity interruptions, such as burst errors,
which are handled strictly as exceptions to the call.
In a local area network (LAN) or other
collision-sensing multiple access channel, or in other
networks with burst characteristics (such as noisy RF
channels), this underlying assumption no longer holds.
Over these sorts of communications channels,
unanticipated transmission delays may result in
bit-bang problems which are not so readily overcome by
limiting the size of the feedback buffer. Thus, video
jerkiness will result in real-time video communication
over such channels. It would be advantageous, and is
therefore an object of the present invention, to
provide a video transmission mechanism which can be
accommodated on such potentially bursty networks.

SUMMARY OF THE INVENTION

From the foregoing, it can be appreciated that there is
a need for a mechanism of incorporating real-time video
data communication over traditional network protocols
to smooth video transmission. It is therefore an object
of the present invention to provide a method and
apparatus for the conveyance of video data over such
networks as local area networks.

These and other objects of the present invention are 
provided by introducing feedback between the video 
CODEC and the intended communications channel 
such that the characteristics of the channel are used to 
drive multiple video output buffers. These multiple 
output buffers share an original temporal video refer- 
ence, but have different subsequent temporal video 
images. The communications channel interface then 
picks the subsequent video image buffer that best 
matches the current conditions experienced by it. By
using a predictor of the channel performance, the video 
algorithm can be tuned to provide video output buffers 
with the best guess of how the buffers should be config- 
ured. A number of subsequent histories of an image are 
buffered until the receiving channel indicates it is ready 
to receive the next. Then the appropriate output buffer 
having the corresponding temporal change in the video 
is used to supply the next frame change information to 
the receiving station. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The objects, features and advantages of the present 
invention will be apparent from the following detailed 
description in which: 

FIG. 1 illustrates a hypothetical network having a
plurality of video-capable nodes for interacting and 
providing video conferencing capabilities. 

FIG. 2 illustrates hardware to be utilized in imple- 
menting the present invention in one embodiment. 

FIG. 3 illustrates a logical rendition of a plurality of
output buffers with successive time interval video infor- 
mation for one embodiment of the present invention. 

FIG. 4 illustrates a branching tree structure corre- 
sponding to successive temporal transmit reference 
images for one embodiment of the present invention.

FIG. 5 illustrates alternative logical output buffer 
uses for channel dependent data transmission over a 
network. 

FIG. 6 illustrates characteristics of audio information 
which may be transmitted over a network in accordance
with another embodiment of the present invention.

FIG. 7 illustrates a generalized block diagram of the 
present invention. 

DETAILED DESCRIPTION OF THE INVENTION

A method and apparatus are described for the
conveyance of real-time isochronous data over bursty
networks. Although the present invention is described
predominantly in terms of the transmission of video
information, the concepts and method are broad enough
to encompass the transmission of real-time audio and
other data requiring isochronous data transfer.
Throughout this detailed description, numerous details
are specified, such as bit rates and frame sizes, in
order to provide a thorough understanding of the
present invention. To one skilled in the art, however,
it will be understood that the present invention may
be practiced without such specific details. In other
instances, well-known control structures and gate
level circuits have not been shown in detail in order
not to obscure unnecessarily the present invention.
Particularly, some functions are described to be
carried out by various logic circuits. Those of
ordinary skill in the art, having had the various
functions described, will be able to implement the
necessary logic circuits without undue
experimentation.

FIG. 1 is used to illustrate a simple network having a
plurality of video-capable nodes. The network is
illustrated as a simple star network 10 having a
centrally incorporated multi-point control unit (MCU).
The network is presented as having five (5) nodes 11,
12, 13, 14 and 15. For the purposes of explanation,
these will all be considered video-capable nodes, with
nodes 12 and 13 supporting IRV video (160 pixels x 120
lines) while nodes 11, 14 and 15 support HQV video
(320 pixels x 240 lines). The network illustrated in
FIG. 1 is purely for illustrative purposes, and many
more complex nodes that are not video-capable may be
incorporated on the same network as the illustrated
nodes. Further, the present invention may be applied
to any network configuration besides the star
configuration of FIG. 1, such as token ring networks,
branching tree networks, etc. The fundamental
requirement for the network which has these
video-capable nodes is that the nodes be able to
transmit data, including video data, from one point to
another and receive acknowledgments from the receiving
node.

FIG. 2 illustrates typical video encoding hardware to
which the present invention may be applied. This can be
used for preparing video data to be transmitted over a
network of the type illustrated in FIG. 1 to provide
real-time video conferencing. A video camera 20
receives the video image that is to be encoded and
conveyed. Such cameras are common and work on a number
of technologies such as charge coupled devices, etc.
The video camera may directly include video CODEC 21
or it may be tightly coupled as illustrated in the
figure. The video CODEC 21 receives the electronic
image from the camera and digitizes the image when
being used in its encoder capacity. Video CODECs are
generally known and come in a number of varieties
which may be used for encoding video data to be
transmitted and decoding video data when received. In
FIG. 2, the camera output is propagated to the capture
buffer 22 of video CODEC 21.
From the capture buffer 22, the video information is
processed by motion estimation circuitry 23. The
motion estimation circuitry is used to generate motion
vectors which describe the difference of a portion of
a video image from the previously recorded image in
terms of a translational offset. The motion estimation
circuitry compares the currently decoded frame with
the previous frame stored in the transmit reference
image buffer 30, about which more will be described
further herein. From the motion estimation circuitry,
the outputs are the motion vectors and the motion
compensated image 24. The motion compensated image 24
is then processed by the differential pulse code
modulation (DPCM) circuitry 25, which generates
digital information of the changes to the previously
stored transmit reference image. A final stage of
coding is done at transform coding block 26, which
also performs quantization and run-length encoding.
Run-length encoding is a technique for compressing
data sequences that have large numbers of zeros and is
well-known to those of ordinary skill in the art. This
transform coder may perform a discrete cosine
transform (DCT).
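The run-length encoding mentioned above can be illustrated with a minimal sketch. It assumes a flat list of quantized coefficients and a simple (zero run, value) pair format closed by an end-of-block marker. The pair format and the function names are hypothetical and do not reproduce the code tables of any particular standard.

```python
def run_length_encode(coeffs):
    """Encode a coefficient list as (zero_run, value) pairs.

    Each pair records how many zeros precede a nonzero value. Trailing
    zeros are not coded; a single "EOB" marker closes the block.
    """
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append("EOB")
    return pairs


def run_length_decode(pairs, length):
    """Rebuild a coefficient list of the given length from the pairs."""
    out = []
    for p in pairs:
        if p == "EOB":
            break
        run, value = p
        out.extend([0] * run + [value])
    return out + [0] * (length - len(out))


coeffs = [5, 0, 0, -3, 0, 0, 0, 1, 0, 0, 0, 0]
print(run_length_encode(coeffs))
```

The sequence of twelve coefficients above collapses to three pairs and a marker, which is why the technique suits quantized transform output dominated by zeros.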

From the transform coding block, the coded sequence
is propagated to the output buffer 27, which is used
to maintain a constant bit rate for the output to the
network. As was described, prior art methods used the
output buffer fullness to regulate the degree of
quantization that would be applied to the compressing
and encoding circuitry because a constant bandwidth
availability was assumed.

The transform coder block 26 also outputs the
compressed image data to generate a new transmit
reference image for storage in the transmit reference
image buffer. The encoding logic provides the
compressed image data to a decoding block 28 that has
an inverse quantizer and inverse discrete cosine
transform decoder, which can be used to combine the
decoded image data with the previously stored transmit
reference image to yield a new transmit reference
image which corresponds to the image that was most
recently propagated on the network. It is this image
data that would be used in calculating the changes in
the image in sending the next frame of information. In
other words, the transmit reference image, which is
the same image that will be reconstructed at the other
end by the video decoder, is used as the basis of
subsequent encoding, including motion vectors and
motion compensated image compression.
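The reference update described above can be sketched in a simplified form. The sketch assumes pixel values held in flat lists and a uniform quantizer, and it omits the transform stage entirely. The function names are hypothetical. The point it illustrates is that the encoder rebuilds its reference from the quantized data, just as the remote decoder will, rather than from the captured frame.

```python
def quantize(residual, step):
    """Quantize each residual value to an integer level."""
    return [round(r / step) for r in residual]


def dequantize(levels, step):
    """Inverse quantizer: scale levels back to residual values."""
    return [level * step for level in levels]


def update_reference(reference, levels, step):
    """Add the decoded residual to the old reference, as a decoder would."""
    decoded = dequantize(levels, step)
    return [ref + d for ref, d in zip(reference, decoded)]


reference = [100, 100, 100]
frame = [104, 97, 100]
step = 4  # assumed quantizer step
levels = quantize([f - r for f, r in zip(frame, reference)], step)
new_reference = update_reference(reference, levels, step)
print(levels, new_reference)
```

The new reference differs slightly from the captured frame because of quantization loss, but it matches what the receiving station reconstructs, so the two ends stay in step.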

As was described in the previous section, the prior art
feedback mechanism using the output buffer assumed a
constant bit rate would be available for the
transmission of information. This assumption no longer
holds for video conferencing type devices which are on
bursty networks such as CSMA LAN networks. The
solution proposed by the present invention is to
provide feedback between the video CODEC and the
communications channel such that the characteristics
of the channel are used to drive multiple video output
buffers. These buffers share an original temporal
video reference but will have different subsequent
temporal video images. The communications channel
interface then picks the subsequent video image buffer
that best matches the current condition. By using a
predictor of the channel performance, the video
algorithm can be tuned to provide video output buffers
with the best guess of how the buffers should be
configured. Once a particular output buffer's image
data is selected, the remaining buffers can be flushed
to be refilled again based on a newly calculated
transmit reference image. In the limit, the final
action is to revert to an exception handler similar to
current video CODECs, i.e., insert a key frame to
restart the encoding of video data transmission.

FIG. 3 illustrates conceptually the logical multiple
output buffers of the present invention. When the
video camera 20 records an image, it is encoded by the
encoding circuitry described above and the encoded
information is propagated to the output buffer 27. In
a bursty network, the network may not be able to
receive this newly calculated image data. Accordingly,
the camera continues to detect images and encode the
data, and newly translated data is stored in
subsequent output buffers such as 41, 42 or 43. For
example, the information stored in the output buffer
27 may correspond to the digital information
equivalent to the changes from the transmit reference
image stored in the transmit reference image buffer 30
at time t=0. In output buffer 41, the data may
correspond to the difference between the transmit
reference image and the image captured 1/15th of a
second later than that represented in buffer 27.
Likewise, output buffers 42 and 43 may store data
corresponding to the temporal change between the
transmit reference image and the image before the
camera at successively later times.

The video encoder and camera circuitry described
may be incorporated as part of a station that is on the
network and is responsive to information received
over the communications channel. When a given node
again has the bus, the output buffer with the most
current image may be signaled to transmit its
information to the receiving node. Likewise, the
channel information is then used to calculate the next
transmit reference image for storage. The output
buffers are then flushed and are again loaded in a
time sequential manner until the data is again ready
to be sent over the network. While four (4) output
buffers are illustrated, this is purely for
illustrative purposes in that as many buffers may be
implemented as computing power and resources provide.
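The buffer cycle described above (fill in time sequence against a fixed transmit reference, send the most current buffer when the channel is available, update the reference, flush) can be sketched as follows. This is a simplified model under stated assumptions: frames are flat lists of pixel values, the differential coding is a plain subtraction without quantization, and the class name, method names and the policy of dropping the oldest candidate when all buffers are full are hypothetical.

```python
class MultiBufferEncoder:
    """Keeps several differential encodings against one transmit reference."""

    def __init__(self, reference, max_buffers=4):
        self.reference = list(reference)
        self.max_buffers = max_buffers
        self.buffers = []  # list of (capture_time, difference)

    def capture(self, capture_time, frame):
        """Encode a new frame against the unchanged transmit reference."""
        diff = [f - r for f, r in zip(frame, self.reference)]
        if len(self.buffers) == self.max_buffers:
            self.buffers.pop(0)  # assumed policy: drop the oldest candidate
        self.buffers.append((capture_time, diff))

    def channel_ready(self):
        """Send the most current buffer, update the reference, flush the rest."""
        if not self.buffers:
            return None
        capture_time, diff = self.buffers[-1]
        self.reference = [r + d for r, d in zip(self.reference, diff)]
        self.buffers.clear()
        return capture_time, diff


enc = MultiBufferEncoder([10, 10])
enc.capture(0, [11, 10])
enc.capture(1, [12, 11])
enc.capture(2, [13, 13])
print(enc.channel_ready(), enc.reference)
```

Because every buffer is coded against the same reference, any one of them can be sent on its own, and the receiver never needs the buffers that were discarded.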

FIG. 4 illustrates conceptually a branching tree that
is pruned at times T=1, T=2, T=3, etc., for each slice
of information that is taken and propagated on the
network. This conceptualizes the use of multiple
output buffers as a tree which is continually pruned,
with the most current pruning corresponding to the
present transmit reference image.

FIG. 5 illustrates another conceptualization of the
present invention. The encoder, through feedback from
the data communications channel, creates several
logical output buffers corresponding to behavioral
predictions based on the feedback from the
communications channel. For example, logical output
buffer 1 could represent the case where more bandwidth
will be dynamically allocated to this natural data
compression over the next unit of time. The unit of
time could be an image frame or, for example, a frame
of sampled audio. The various predictions of the
bandwidth available to the compression algorithm in
FIG. 5 are shown below in Table I.



TABLE I

Logical Output    Prediction of Bandwidth per Unit Time
Buffer            Relative to Current Transmit Reference
--------------    --------------------------------------
1                 about the same
2                 a lot more
3                 more
4                 a lot less



For video coding, more bandwidth could be used to
get sharper images and/or a higher frame rate. The
actual data contained in the logical output buffers
can be significantly different, too. For example, in
video coding, the new transmit reference might be
calculated from input images that differ in time
and/or spatial resolution. Logical output buffer 1
might represent the data from an image taken 1/15th of
a second later than transmit reference 0, while
logical output buffer 2 might represent the
differential coding from an image half a second later
than transmit reference 0. Such an approach would be
good for video coding for channels where the bit rate
allocated to video may undergo extreme fluctuations,
such as in the bursty networks described above.
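One way to picture the selection among logical output buffers is sketched below. It assumes each logical buffer has been encoded to a different size according to its bandwidth prediction, and that the channel interface reports how many bits it can accept. The selection rule (largest buffer that fits), the buffer sizes and the function name are assumptions made for illustration; the description above does not prescribe a particular rule.

```python
def pick_logical_buffer(buffer_sizes, available_bits):
    """Pick the logical buffer that best uses the available bandwidth.

    buffer_sizes maps a logical buffer number to its encoded size in bits.
    Returns the number of the largest buffer that fits, or None when no
    buffer fits, in which case the caller falls back to exception
    handling such as inserting a key frame.
    """
    fitting = {n: size for n, size in buffer_sizes.items()
               if size <= available_bits}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


# Hypothetical encoded sizes, ordered like the predictions in Table I.
sizes = {1: 8000, 2: 20000, 3: 12000, 4: 3000}
print(pick_logical_buffer(sizes, 13000))
```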

While the output buffers are illustrated in FIGS. 2
and 3 as, for example, discrete memory elements, FIG.
5 makes it clear that logical buffers may be created
in a common block of memory and that the number of
such buffers is limited only by the available
computational power to simultaneously encode them and
the memory to sufficiently handle them.

FIG. 6 is used to illustrate that the present
invention is not necessarily limited to video encoding
and illustrates a frame of audio information. For
example, in the G.728 standard each frame of data is 5
milliseconds long. The frame may be stored as a
transmit reference and subsequent transmissions may
follow the differential coding principles wherein only
the changed information is sent to the receiving node.
The audio encoder may be responsive to feedback from
the network and maintain a plurality of logical output
buffers such as those described in the video
application. One possible application for such an
implementation would be in wireless telephony, wherein
portions of an audio transmission may be lost when a
transmitting station goes through a tunnel. The
responding network indicates that its most recently
received information is slightly stale and that a
late-change logical output buffer should be used in
providing the encoded differential information.

In a more general description of the present
invention, reference is now made to FIG. 7.
Information about a real-time object 100 that is
desired to be conveyed from a transmitting node to a
receiving node on some sort of network is shown. This
real-time object 100 may be a video image or it may be
a sound, depending on the particular implementation. A
capture mechanism 110 detects the real-time object and
encodes it into electronic information. The capture
mechanism may be a camera for video information as
described above, or a microphone or stereo microphones
for audio information. This information is then
processed by differential encoder 115, which compares
the newly captured real-time object to the previously
stored recorded object in transmit reference buffer
120. The differentially encoded data is then
propagated to a logical output buffer 125, which
operates as those described above. When the network
clears the output buffers for transmission, the
particular output buffer having the best information
conveys it over the network, and that same information
is used to calculate a new transmit reference to be
stored in transmit reference buffer 120.